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Abstract 

In this paper we develop the theory of parametric polynomial regression in Riemannian manifolds and Lie groups. 
We show application of Riemannian polynomial regression to shape analysis in Kendall shape space. Results are 
presented, showing the power of polynomial regression on the classic rat skull growth data of Bookstein as well as 
the analysis of the shape changes associated with aging of the corpus callosum from the OASIS Alzheimer's study. 

1 Introduction 

The study of the relationship between measured data and descriptive variables is known as the field of regression 
analysis. As with most statistical techniques, regression analysis can be broadly divided into two classes: parametric 
and non-parametric. The most widely known parametric regression methods are linear and polynomial regression 
in Euclidean space, wherein a linear or polynomial function is fit in a least-squares fashion to observed data. Such 
methods are the staple of modern data analysis. However, classical regression suffers from the fundamental limitation 
that the data must lie in a Euclidian vector space. The most common non-parametric regression approaches are 
kernel-based methods and spline smoothing approaches which provide much more flexibility in the class of regression 
functions. However, their non-parametric nature presents a challenge to inference problems, for instance if one wishes 
to perform a hypothesis test to determine whether the trend for one group of data is significantly different from that of 
another group. 

Fundamental to the analysis of anatomical imaging data within the framework of computational anatomy is the 
analysis of transformations and shape which are best represented as elements of Remaninan Manifolds rather than Eu- 
clidian vector spaces. In previous work, non-parametric kernel-based or spline-based methods have been extended to 
observations that lie on a Remannian manifold with some success [Davis et al., 2010, Jupp and Kent, 1987], but intrin- 
sic parametric regression on Riemannian manifolds has received limited attention. Most recently, Fletcher [Fletcher, 201 1] 
and Niethammer et al. [Niethammer et al., 201 1] have each independently developed geodesic regression which gen- 
eralizes the notion of linear regression to Riemannian manifolds. 

The goal of the current work is to extend such work in order to accommodate more flexibility in the model while 
remaining in the parametric setting. The increased flexibility introduced by the methods in this manuscript allow a 
better description of the variability in the data, and ultimately will allow more powerful statistical inference. Our work 
builds off that of Jupp & Kent [Jupp and Kent, 1987], whose method for fitting parametric curves to the sphere involved 
intricate unwrapping and rolling processes. The work presented in this paper allows one to fit regression curves on a 
general Riemannian manifold, using intrinsic methods and avoiding the need for unwrapping and unrolling. 

We demonstrate the usefulness of our algorithm in three studies of shape. By applying our algorithm to Bookstein's 
classical rat skull growth dataset [Bookstein, 1991], we show that we are able to obtain a parametric regression curve 
of similar quality to that produced by non-parametric methods [Kent et al., 2001]. We also demonstrate in a 2D corpus 
callosum aging study that, in addition to providing more flexibility in the traced path, our polynomial model provides 
information about the optimal parametrization of the time variable. 
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2 Methods 



2.1 Preliminaries 

Let (M,g) be a Riemannian manifold [do Carmo, 1992]. For each point p € M, the metric 5 determines an inner 
product on the tangent space T p M as well as a way to differentiate vector fields X, Y with respect to one another. 
That derivative is referred to as the covariant derivative and is denoted V xY ■ If V € X(M ) = {/ e C°°(M, TM) : 
f(p) £ T p M, Vp 6 M} is a smooth vector field on M and 7 : [0, T] — > M is a smooth curve on M then the covariant 
derivative of V along 7 is the time derivative of V in a reference frame along 7 

= jv^m (i) 

where the intrinsic derivative % is determined by the metric. Geodesies 7 : [0, T] — > M are characterized by the 
second-order covariant differential equation (c.f. [do Carmo, 1992]) 

V^ 7 - 0. (2) 

This equation, called the geodesic equation, uniquely determines geodesies up to a choice of initial conditions (7(0), 7(0)) £ 
TM. The mapping from the tangent bundle into the manifold is called the exponential map, Exp : TM — > M. Fixing 
the base point p 6 M, the exponential map is injective on a zero-centered ball B in T p M of some non-zero (possibly 
infinite) radius. Thus for a point q in the image of B under Exp p there exists a unique vector v 6 T p M corresponding 
to a minimal length path under the exponential map from p to q. The mapping of such points q to their associated 
tangent vectors v at p is called the log map of q at p, denoted v — Log p q. 

Given a curve 7 : [0, T] —> M we'll want to relate tangent vectors at different points along the curve. These 
relations are governed infinitesimally by the covariant derivative V-y. A vector field X : [0, T] —> TjmM along the 
curve 7 is parallel transported along 7 if it satisfies the parallel transport equation: 

VjX(t) = (3) 

for all times t £ [0, T]. Notice that the geodesic equation is a special case of parallel transport, in which we require 
that the velocity is parallel transported along the curve itself. 



2.2 Riemannian Polynomials 

Given a vector field X along a curve 7, the covariant derivative of X gives us a way to define vector fields which are 
"constant" along 7, as parallel transported vectors. Geodesies are generalizations to the Riemannian manifold setting 
of curves with constant first derivative. 

The covariant derivative of a vector field along a curve gives another vector field along the curve. We will apply 
that derivative repeatedly to examine curves with constant higher derivatives. For instance, we refer to the vector field 
V-y(t)7(£) as the acceleration of the curve 7. Curves with constant acceleration are generalizations of quadratic curves 
in R and satisfy the second order polynomial equation 

(V^) 2 7(i) = 0, (4) 
with initial conditions: 7(0), 7(0) and 7(0). 

Extending this idea, a cubic polynomial is defined as curve having constant jerk (time derivative of acceleration), and 
so on. Generally, a fcth order polynomial in M is defined as a curve 7 : [0, T] —> M satisfying 

(Vy) fc 7 W - (5) 
with initial conditions: 7(0) and 7*(0), i = 1, • • ■ ,k, 

for all time points t € [0, T]. As with polynomials in Euclidian space, a fc th order polynomial is determined by the 
(k + 1) initial conditions at t = 0. 
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The covariant differential equation governing the evolution of Riemannian polynomials is linearized in the same 
way that a Euclidian ordinary differential equation is. Introducing vector fields v\(t), . . . , Vk(t) € T^mM, we can 
write the system of covariant differential equations as 



j(t)= Vl (t) 



(6) 
(7) 



V 7 Ufc_i(t) = v k (t) 
V^v k {t) = 0. 



(8) 
(9) 
(10) 



Algorithm 1 Pseudocode for forward integration of k order Riemannian polynomial 

7 <- 7(0) 

for i = 1, . . . ,k do 

Vi <- Uj(0) 
end for 
f <- 
repeat 

to <— ui 

for i = 1, . . . , k — 1 do 

Vi <— ParallelTransport 7 (Atu;, + AtUi+i) 
end for 

Vk <— ParallclTransport 7 (Atw, «fc) 
7 <- Exp 7 (Aiw) 
< <- i + At 
until t=T 




Figure 1: Sample polynomial curves emanating from a common basepoint (green) on the sphere 
(black=geodesic,blue=quadratic,red=cubic). 

The Riemannian polynomial equation cannot, in general, be solved in closed form, and must be integrated nu- 
merically. In order to discretize this system of covariant differential equations, we implement the covariant integrator 
depicted in Alg. 1. At each step of the integrator, each vector is incremented within the tangent space at j(t) and the 
results are parallel transported infinitesimally along a geodesic from 7(f) to j(t + At). The only ingredients necessary 
to integrate a polynomial forward in time are the exponential map and parallel transport on the manifold. 

Figure 1 shows the result of integrating polynomials of order one, two, and three. Initial velocity, acceleration, and 
jerk were chosen and a cubic polynomial integrated to obtain the blue curve. Then the initial jerk was set to zero and 
the blue quadratic curve was integrated, followed by the black geodesic whose acceleration was set to zero. 
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2.3 Estimation of Parameters for Regression 

In order to regress polynomials against observed data yi 6 M, i = 1, .. . , N, we define the following constrained 
objective function 

1 N 

£o(7(0), «i(0),..., Ufe (0)) = -J2d(~f(U), yi ) 2 (11) 

i=l 

subject to: -y(t) = i7i(t) (12) 

VyV 1 (t)=V 2 (t) (13) 

: (14) 

VjV k - 1 (t) = v k (t) (15) 
V^ fc (t) = 0. (16) 

which is minimized in order to find the optimal initial conditions 7(0), Ui(0), i — 1, . . . , k, which we will refer to as 
the parameters of our model. 

In order to determine the optimal parameters 7(0), Uj(0), i = 1, . . . , k, we introduce Lagrange multiplier vector 
fields A; £ X(M), i = 0, . . . , k, often called the adjoint variables, and define the unconstrained objective function 



1 N 

E( 7 (0),v(0),X) = - ]T d(j(U), Vl ) 2 (17) 
(Ao(i),7(*)-«i(*)>* ( 18 > 

fe-l „t 

V/ (\i(t),VyVi(t)-V i+ l(t))dt (19) 

i=l - 70 

T 

(X k (t),W^v k (t))dt. (20) 





As is standard practice, the optimality conditions for this equation are obtained by taking variations with respect to all 
arguments of E, integrating by parts when necessary. The resulting variations with respect the the adjoint variables 
yield the original dynamic constraints: the polynomial equations. Variations with respect to the primal variables gives 
rise to the following system of equations, termed the adjoint equations. The adjoint equations take the following form 
(see appendix for derivation): 

2 N k 
V^A = -— ^^(t-< l )Log 7 y i - ^R{vi,Xi)vx (21) 

i=l i=l 

V^Ax - -A (22) 

: (23) 
V^A fe = -A fc _i, (24) 

where R is the Riemannian curvature tensor and the Dirac 8 functional indicates that the order zero adjoint variable 
takes on jump discontinuities at time points where data is present. Gradients of E with respect to initial and final 
conditions give rise to the terminal endpoint conditions for the adjoint variables, as well as expressions for the gradients 
with respect to initial conditions. 

Ai(T) = 0,i = 0,--- ,k (25) 
5 7{0) E = -Ao(0) (26) 
S vm E = -X i (0) (27) 

In order to determine the value of the adjoint vector fields at t = 0, and thus the gradients of our functional Eq, the 
adjoint variables are first initialized to zero at time T, then the system Eq. 24 is evolved backward in time to t = 0. 
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Given the gradients with respect to the initial conditions, a simple steepest descent algorithm is used to optimize 
the functional. The update to 7(0) is computed using the exponential map, and the vectors i>j(0) are updated via 
parallel translation. 

Note that in the special case of a zero-order polynomial (k — 0), the only gradient Ao is simply the mean of the log 
map vectors at the current estimate of the Frechet mean. So this method generalizes the common method of Frechet 
averaging on manifolds [Fletcher et al., 2004]. 

The curvature term, in the case k = 1, indicates that Ai is a sum of adjoint Jacobi fields. So this approach 
subsumes geodesic regression as presented by Fletcher [Fletcher, 2011]. For higher order polynomials, the adjoint 
equations represent a generalized Jacobi system. 

In practice, it is often not necessary to explicitly compute the curvature terms. In the case that the manifold M 
embeds into a Hilbert space, the extrinsic adjoint equations can be computed by taking variations in the ambient 
space, using standard methods. Such an approach gives rise to the regression algorithm found in Niethammer et al. 
[Niethammer et al., 201 1], for example. 



Algorithm 2 Pseudocode for reverse integration of adjoint equations for k order Riemannian polynomial 

7 <- W) 

for i = 0, . . . , k do 

end for 

t <- T 
repeat 

w <— Vi(t) 

Ao <- Ao + AtEi=ii2(wi,Ai)«i 
if t = ti then 

A <- A + jj Log 7 yi 
end if 

for i = k, . . . , 1 do 

A; <- ParallelTransport 7 (— Atw, \ t + AtA»_i) 
end for 

Ao ParallelTransport 7 (— Atw, Ao) 
7 <- Exp 7 (-Atw) 
t<-t-At 
until t=0 

8y(o)E < ^0 

for i = 1, . . . , k do 

$Vi(0)E < 

end for 



2.4 Time Reparametrization 

In the geodesic model, curves propagate at a constant speed, a result of their extremal action property. However, using 
higher-order polynomials it is possible to generate curves whose images match those of a geodesic, but whose time- 
dependence has been reparametrized. If the initial conditions consist of all Vi collinear, then this will necessarily be 
the case. Polynomials provide flexibility not only in the class of paths that are possible, but in the time dependence of 
the curves traversing those paths. Regression models could even be imagined in which the operator wishes to estimate 
geodesic paths, but is unsure of parametrization, and so enforces the estimated parameters to be collinear. 

2.5 Polynomials on Riemannian Lie Groups 

A special case arises when the manifold is a Lie group and the metric is left-invariant. In this case, left-translation by 7 
and 7~\ denoted L 7 , L 7 -i, allow one to isometrically represent tangent vectors at 7 as tangent vectors at the identity, 
using the pushforward The tangent space at the identity is isomorphically identified with the Lie algebra g. 

The algebraic operation in g is called the Lie bracket [, ] : g x g —> g. For X.Y e q, this bracket operation, also 
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called the adjoint action of X on Y and often denoted adx Y — [X, Y], is bilinear and alternating in its arguments 
(i.e. adx Y = — ady X). Fixing X, the operator adx is linear and its adjoint is computed with respect to the metric: 



(adxYZ) = (YM X Z). (28) 

We will refer to the operator ad^- as the adjoint-transpose action of X. 

Suppose X is a vector field along the curve 7 and X c = L~-i*X G g is its representative at the identity. Similarly, 
uj c = L 7 -i + 7 is the representative of the curve's velocity in the Lie algebra. Then the covariant derivative of X along 
7 evolves in the Lie algebra via 



L^V^X = X c -\ (ad* X c + ad x lj c - ad Wc X c ) . (29) 



In the special case when X = 7 so that X c = cu c , if we set this covariant derivative equal to zero we have the geodesic 
equation: 



id* w CJ (30) 



which in this form is often refered to as the Euler-Poincare equation [Arnol'd, 1989, Holm et al., 1998]. The curvature 
tensor is given by the standard formula 

R{X, Y)Z = V x VyZ - Vy^xZ - V [X<Y] Z. (31) 

Applying Eq. 29, in a Lie group with left-invariant metric, these terms are computed in the Lie algebra using the 
formulas: 



V x Vy Z = - ( ad adx y Z — ad x adt, Y - ad x ad Y Z (32) 
-ad f , 7 X + ad f t X + ad f t X (33) 

ady Z ad^ Y ad], Z v ' 

- ad x ady Z + ad x ad f z Y + ad x ad Y Z) (34) 
V [x ,y] Z = - (ad adx y Z - ad f z ad x Y - ad^ x Y z) . (35) 

Also note that if the metric is both left- and right-invariant, then the adjoint-transpose action is alternating and we 
can simplify the covariant derivative further. In particular, in the presence of such a bi-invariant metric, the first few 
covariant derivatives take the following forms: 

Ly-i* (^j) k i = (Jjj + \ ad "c^J W c (36) 

L 7 -i*V^7 = cJ c (37) 

2 1 

L 7 -i*(V^) 7 = cJ c + - ad Wc ui c (38) 

3 1 

L 7 -i* (V 7 ) 7 = ti' c + ad Uc ui c + - ad Uc ad W(! cJ c . (39) 

This allows us to solve the forward evolution in the convenient setting of the vector space g. Parallel transport requires 
solution of the equation 

j t + ad^ X = 0, (40) 

which can be integrated using Euler integration or similar time-stepping algorithms, so long as the adjoint action can 
be easily computed. Because of the simplified covariant derivative formula for a bi-invariant metric, the curvature 
takes the simple form [do Carmo, 1992]: 

R{X,Y)Z =^ad z ad x Y (41) 
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2.6 Coefficient of Determination for Regression in Metric Spaces 

It will be useful to define a statistic which will indicate how well our model fits some set of observed data. As in 
[Fletcher, 2011], we compute the coefficient of determination of our regression polynomial j(t), denoted R 2 . The 
first step to computing R 2 is to compute the variance of the data. As discussed in [Fletcher, 201 1], the natural choice 
of total variance statistic is the Frechet variance, defined by 

1 N 

Var{y 4 } = - min £ d(y, yi f. (42) 

i— 1 

Note that the Frechet mean y itself is the 0-th order polynomial regression against the data {j/j} and the variance is 
the value of the objective function Eq at that point. We also define the sum of squared error for a curve 7 as the value 

£0(7): 



1 N 

SSE=-J2dh(U), yi ) 2 (43) 

Then the coefficient of determination is defined as 

R 2 = 1 - SSE (44) 
Var{yJ 

Any regressed polynomial will have SSE < Var{yj}, R 2 will have a value between and 1, with 1 indicating a 
perfect fit and indicating that the curve 7 provides no better fit than does the Frechet mean. 

3 Examples 

3.1 The n-Dimensional Sphere 

Suppose M = S n — {p G K™ +1 : \\p\\ = 1}. Then geodesies are great circles and the exponential map is given by 

v 

Exp v = cosO ■ p + sin 6*7---, 9 — \\v\\ (45) 
H 

The corresponding log map is the inverse of this function: 

Log p q = 9 *~ ( j pT T q \ P u , d = co S - 1 (p T q). (46) 

The Riemannian curvature tensor is [do Carmo, 1992] 

R(X, Y)Z = (X T Z)Y - (Y T Z)X. (47) 

Parallel translation of a vector X along the exponential map of a vector v is performed as follows. The vector X is 
decomposed into a vector parallel to v, which we denote X v , and a vector orthogonal to v, X 1 - . X 1 - is left unchanged 
by parallel transport along Exp p v, while X v transforms in a similar way as v: 

\ T 
V \ V 



X v 1 — } [ — sin 9p + cos 6- — - ) - — —X v . (48) 

V ' w\) \\n 

Using this exponential map and parallel transport, the integrator depicted in Alg. 1 was implemented. Figure 1 shows 
the result of integrating polynomials of order one, two, and three, using the equations from this section, for the special 
case of a two-dimensional sphere. 
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3.2 The Lie Group SO (3) 

In order to illustrate the algorithm in a Lie group, we consider the Lie group SO (3) of orthogonal 3-by-3 real matrices 
with determinant one. We review the basics of SO (3) here, but the reader is encouraged to consult [Arnol'd, 1989] for 
a full treatment of SO(3) and derivation of the Euler-Poincare equation there. 

The Lie algebra for the rotation group is so(3), the set of all skew-symmetric 3-by-3 real matrices. Such matrices 
can be identified with vectors in M 3 by the identification 

a 

b I i y <r{x) = I c -a I (49) 
c ) \ -b a / 

under which the Lie bracket takes the convenient form of the cross-product of vectors: 

x x y = a(x)y i-> a(x X y) = [a(x),a(y)]. (50) 

With the W 3 representation of vectors in so (3), an invariant metric is determined by the choice of a symmetric, positive- 
definite matrix A: 

(x, y) — x T Ay,^x, y 6 so(3) = R 3 . (51) 
With this metric the adjoint-transpose action can be written as 

adj. y = -A- 1 (x x Ay) . (52) 
If we plug this into the general Lie group formula we obtain the geodesic equation in the Lie algebra: 

lu c = -A _1 (w c x Alj c ) (53) 

which must be integrated numerically to find oj c at each time. To compute the exponential map Exp p v, the algorithm 
depicted in Alg. 1 is employed. The matrix v is converted to a skew-symmetric matrix by multiplication by p — 1 , which 
translates v back into the Lie algebra. The resulting vector is converted to a vector ui c (0). The geodesic equation is 
then integrated to find lu c at future times. The point p then evolves along the exponential map by time-stepping, where 
at each step oj c is converted back into a skew-symmetric matrix W, multiplied by At and exponentiated to obtain a 
rotation matrix. The Euler step is then given by left-multiplication by the resulting matrix. 
Plugging into Eq. 35, the terms need to compute the curvature of SO (3) are 

V^VyZ = -((X xY)x Z + X x A~ X (Z x AY) + X x A~\Y x AZ) (54) 

+ A -1 ((Y x Z) x AX) + x AY) x AX) + A _1 (A -1 (Y x AZ) x AX) (55) 

+ A^{X x A(Y x Z)) + A-\X x (Z x AY)) + A^ 1 (X x (Y x AZ))) (56) 

V [x ,y]Z = - ((X x Y) x Z + A~ 1 (Z x A(X x Y)) + A^^X xY)x AZ)) , (57) 
and the parallel transport equation is given by 

X c = - (A- 1 (-X c x Auj c -lu c x AX C ) -uj c x X c ) . (58) 
The evolution equations are then integrated numerically as in Alg. 1 . 



3.3 Kendall Shape Space 

The analysis of shape is a common challenge in medical imaging. Shape is defined as the property of the data that 
is invariant to scale or rotation. It was in this setting that Kendall [Kendall, 1984] originally developed his theory 
of shape space. We describe here Kendall's shape space of m-landmark point sets in W 1 , denoted E™. For a more 
complete overview of Kendall's shape space, including computation of the curvature tensor, the reader is encouraged 
to consult the work of Kendall and Le [Kendall, 1984, Kendall, 1989, Le and Kendall, 1993]. 
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Let x = (a;j)j=i,...,m) %i € K d be a labelled pointset in M. d . Translate the pointset so that the centroid resides at 
0, removing any effects due to translations. Another degree of freedom due to scaling is removed by requiring that 
YhLi \\ x i\\ 2 = !• After this standardization, one is left with a point in S m , commonly referred to as preshape 
space in this context. 

Kendall shape space E™ is obtained by taking the quotient of the preshape space by the natural action of the 
rotation group SO(d) on pointsets. In practice, points in the quotient (refered to as shapes) are represented by a 
member of their equivalence class in preshape space. The work of O'Neill [O'Neill, 1966] characterizes the link in 
geometry between the shape and preshape spaces. We describe now how to compute exponential maps, log maps, and 
parallel transport in shape space, using representatives in S md ~ 1 . 

First, notice that there exist tangent vectors in 5' md ~ 1 corresponding to global rotations of the preshape. At a 
particular point p in preshape space, these vectors form a linear subspace of the tangent space T p S m , which we 
define as the vertical subspace. Curves moving along vertical tangent vectors result in rotations of a preshape, and so 
do not indicate any change in actual shape. Since we are concerned with shape change and wish to ignore effects due 
to rotation, it will be important to project such tangent vectors to be purely non-vertical, or horizontal. We denote the 
horizontal projection by U : TS md - 1 TS rnd - 1 . 

A vertical vector in preshape space arises as the derivative of a rotation of a preshape. The derivative of such a 
rotation is a skew-symmetric matrix W, and its action on apreshape x has the form (Wxi, . . . , Wx n ) € TS md ~' 1 . The 
vertical subspace is then spanned by such tangent vectors arising from any linearly independent set of skew-symmetric 
matrices. The projection T~L is performed by taking such a spanning set, performing Gram-Schmidt orthonormalization, 
and removing each component. 

The horizontal projection allows us to relate the covariant derivative on the sphere to that on shape space. Lemma 
1 of O'Neill [O'Neill, 1966] states that if X, Y are horizontal vector fields at some point p in preshape space, then 

HV X Y = V* X ,Y*, (59) 

where V denotes the covariant derivative on preshape space and V* , X*, and Y* are their counterparts in shape space. 

For a general shape space S™, the exponential map and parallel translation are performed using representatives in 
gmd-i j n p rac ti ce ^ this usually must be done in a time-stepping algorithm, in which at each time step an infinitesimal 
spherical parallel transport is performed, followed by the horizontal projection. The resulting algorithm can be used to 
compute the exponential map as well. Computation of the log map is less trivial, as it requires an iterative optimization 
routine. The target shape is first aligned to the base shape using Procrustes alignment. Then, at each iteration of the 
log map routine, the exponential map is integrated forward, compared to the target preshape, then the resulting tangent 
vector is adjoint parallel transported back to the base point, for update in a steepest descent iteration. 

A special case arises in the case when d = 2. In this case the exponential map, parallel transport and log map have 
closed form. The reader is encouraged to consult [Fletcher, 201 1] for more details about the two-dimensional case. 

With exponential map, log map, and parallel transport implemented as described here, one can perform polynomial 
regression on Kendall shape space in any dimension. We demonstrate such regression now on examples in two 
dimensions. 

3.3.1 Rat Calivarium Growth 

The first dataset we consider was first analyzed by Bookstein [Bookstein, 1991]. The data is available for download 
athttp://life.bio. sunysb . edu /mo rph/ data /data sets . html and consists of m — 8 landmarks on a 
midsagittal section of rat calivaria (upper skulls). Landmark positions are available for 18 rats and at 8 ages apiece. 
Riemannian polynomials of orders k = 0, 1, 2, 3 were fit to this data. The resulting curves are shown in Fig. 2 and in 
zoomed detail in Fig. 3. Clearly the quadratic and cubic curves differ from that of the geodesic regression. The R 2 
values agree with this qualitative difference: the geodesic regression has R 2 = 0.79, while the quadratic and cubic 
regressions both have R 2 values of 0.85 and 0.87, respectively. While this shows that there is a clear improvement 
in the fit due to increasing k from one to two, it also shows that little is gained by further increasing the order of the 
polynomial. Qualitatively, Fig. 3 shows that the slight increase in R 2 obtained by moving from a quadratic to cubic 
model corresponds to a marked difference in the curves, possibly indicating that the cubic curve is overfitting the data. 

3.3.2 Corpus Callosum Aging 

The corpus callosum is the major white matter bundle connecting the two hemispheres of the brain. In order to investi- 
gate shape change of the corpus callosum during normal aging, polynomial regression was performed on a collection of 
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Figure 2: Bookstein rat calivarium data after uniform scaling and Procrustes alignment. Color indicates age group. 
Zoomed views of individual rectangles are shown in Fig. 3. 

data from the OASIS brain database (www . oasis-brains . org). Magnetic resonance imaging (MRI) scans from 
32 normal subjects with ages between 19 and 90 years were obtained from the database. A midsagittal slice was ex- 
tracted from each volumetric image and segmented using the ITK-SNAP program (www . itksnap . org). Sets of 64 
landmarks were optimized on the contours of these segmentations using the ShapeWorks program [Cates et al., 21 ] 
(www . sci . utah . edu/sof tware . html). The algorithm generates samplings of each shape boundary with op- 
timal correspondences among the populaton. 

Regression results for geodesic, quadratic, and cubic regression are shown in Fig. 4. At first glance the results 
appear similar for the three different models, since the motion envelopes show close agreement. However, the R 2 
values show an improvement from geodesic to quadratic (from 0.12 to 0.13) and from quadratic to cubic (from 0.13 
to 0.21). Inspection of the estimated initial conditions, shown in Fig. 5 reveals that the tangent vectors appear to be 
rather collinear. For the reasons stated in Sec. 2.4, this suggests that the differences can essentially be described as 
reparametrization of the time variable, which is only accommodated in the higher order polynomial models. 



Appendix: Derivation of Adjoint Equations 

In this section we derive the adjoint system for the polynomial regression problem. The approach to calculus of 
variations followed here is outlined by, for example, Noakes et al. [Noakes et al., 1989]. Consider a simplified 
objective function containing only a single data term, at time T: 

E{ 1 ,{v l },{X l })^d{j{T),y) 2 + (A ,7- + V / (A,, VjV, - v i+1 )dt + / (\ k ,V^v k )dt. (60) 

Jo i=1 Jo Jo 
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Now consider taking variations of E with respect to the vector fields t>j. For each i there are only two terms containing 
Vi, so if W is a test vector field along 7, then the variation of E with respect to v.i in the direction W satisfies 

(5 Vi E,W)dt= { (Ai,V-vW)dt- / (V-i,, W)di. (61) 



The first term is integrated by parts to yield 

(^ i E,W)dt = (A i ,l^}|J- / (Vy\i,W)dt- [ (Xi-uW)dt. (62) 

J Jo 

The variation with respect to Vi for i = 1, . . . , k is then given by 

= = -V^A, - Ai_i, t e (0, T) (63) 
&Vi(T)E = = Ai(T) (64) 
5 M0) E = -Xi(t). (65) 

In order to determine the differential equation for Ao, the variation with respect to 7 must be computed. Let W again 
denote a test vector field along 7. For some e > 0, let {75 : s 6 (— e, e)} be a differentiable family of curves satisfying 

7o = 7 (66) 

^-~fs\ s =o=W. (67) 
as 
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If e is chosen small enough, the vector field W can be extended to a neighborhood of 7 such that [W, is] = 0, where 
a dot indicates the derivative in the direction. The vanishing Lie bracket implies the following identities 

Vw%=V* m W (68) 
V w Vj,=V^y w + R(W,j s ). (69) 

Finally, the vector fields Vi, Xi are extended along 7^ via parallel translation, so that 

v wVi = (70) 

V w Ai = 0. (71) 

The variation of E with respect to 7 satisfies 

T > 

{S y E, W)dt = -^(js, {v^, {\*})\s=o (72) 







d " T 



ds Jq 

As the Xi are extended via parallel translation, their inner products satisfy 

d 



ds 

-(Log 7(T) y, W(T)) + ^~ s J o (Ao, is - vx)dt\ a=0 (73) 

d k ~ 1 r T 

+ -rJ2 ( A *> V 7 S ^ -v i+ i)dt\ s= o (74) 
(Xk,V%v k )dt\ s=0 . (75) 



d " T 



, (X i ,U)\ s=0 = (V w X i ,U) + (X i ,V w U) = (X i ,V w U). (76) 
ds 



Then applying this to each term in the previous equation, 



(S 7 E,W)dt = -(Log {T) y,W(T)) + [ (A , V w -y - V w v x )dt (77) 

Jo 

fe— 1 „x 



y2 {Xi,V w V^Vi-VwVi+i)dt 
i=i 7 o 



(78) 



T 

(X k ,W w ^jV k )dt. (79) 



(80) 



Then by construction, since Vw'Vi = 0, 

/ (5 7 S,W)di = -(Log (T)2 /,W^(T))+ / (X ,V w j)dt + y2 (X^VwV^dt. 
Jo Jo i=1 Jo 

Then using the Lie bracket and curvature identities, this is written as 

i>T r-T fe />T 

/ (& y £,W)dt = -<Lo &r(T) y,W(T))+ / (A ,V^)rfi + V / (A,, V-yVwrv, + E(W l7 K)dt, (81) 
Jo Jo i=1 Jo 

which is further simplified, again using the identity Vw v i = 0: 

/ (S 7 E,W)dt = -(Log 7{T) y,W(T))+ / (A ,V^)^ + V/ (A,, 7)^)*, (82) 
Jo Jo i=1 Jo 

Using the Bianchi identities, it can be demonstrated that the curvature tensor satisfies the identity [do Carmo, 1992]: 

(A,R(B,C)D) = -{B,R(D,A)C), (83) 
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for any vectors A, B 7 C, D. The covariant derivative along 7 is also integrated by parts to arrive at 

____ 

/ (5 1 E,W)dt = -(Lo &y(T) y,W(T)) + {A ,W)\Z- / (W^X ,W)dt~Y. (R(vi,Xi)j,W}dt. (84) 
Jo Jo i=1 Jo 

Finally, gathering terms, the adjoint equation for Ao and its gradients are obtained: 

k 

6 l(t) E = = -V^\ -J2R(Vi,\i)i, ie(0,T) (85) 



»=i 



5 7 (T)-E = = -Log 7(T) y + A (86) 
S 7(0 )£ - -A . (87) 

Along with the variations with respect to Vi, this constitutes the full adjoint system. Extension to the case of multiple 
data at multiple time points is trivial, and results in the adjoint system presented in Sec. 2.3. 
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Figure 4: Geodesic (top, R 2 = 0.12) quadratic (middle, R 2 = 0.13) and cubic (bottom, R 2 = 0.21) regression for 
corpus callosum dataset. Color represents age, with blue indicating youth (age 19) and red indicating old age (age 90). 
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Figure 5: Estimated parameters for cubic regression of corpus callosum dataset. The velocity (black) is nearly collinear 
to the acceleration (blue) and jerk (red). 
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